SAPPHIRINE: Sensor-based Analysis of Pollution in the Philadelphia Region with Information on Neighborhoods and the Environment

SAPPHIRINE integrates pollution and geospatial data relevant to investigators, citizen scientists, and policy makers in the Greater Philadelphia Area. SAPPHIRINE’s capabilities include providing interactive maps and customizable data retrieval to aid in the visual identification of pollution and other factor hotspots as well as hypothesis generation regarding relationships among these factors and health outcomes. Data for pollution originate from AirCasting and PurpleAir crowdsourced databases (http://aircasting.org/mobile_map, https://www.purpleair.com/sensorlist), and data for crime, Area Deprivation Index, and traffic originate from OpenDataPhilly (https://www.opendataphilly.org/dataset/crime-incidents), Neighborhood Atlas (https://www.neighborhoodatlas.medicine.wisc.edu/), and PennDOT (https://data-pennshare.opendata.arcgis.com/datasets/rmstraffic-traffic-volumes/data), respectively. Data for pollution were also colected with AirBeam sensors in a pilot study by the Himes Lab, funded by CEET (http://ceet.upenn.edu/). EPA pollution data were downloaded from the EPA Air Data Portal (https://aqs.epa.gov/aqsweb/airdata/download_files.html).

The sapphirine package comes with 2 comprehensive data sets: local.data, which includes sensor-based data for temperature, humidity, PM1, PM2.5, and PM10 as well as crime data; and EPA.data, which includes data for PM2.5, PM10, SO2, NO2, O3, and CO measured at EPA monitoring stations. Data for Traffic and Area Deprivation Index are pre-rendered in raster format and can be accessed from traffic.raster and ADI_data, respectively.

Install the prerequisite R packages if they do not exist:

library(devtools)
if('sapphirine' !%in% installed.packages()){
  devtools::install_github("HimesGroup/sapphirine", subdir = "pkg/sapphirine")
}

Load package:

library(sapphirine)

selectGPACounties

Select subset of of Greater Philadelphia Area counties to generate a custom shapefile for data subsetting and raster generation. Call sapphirine::GPACountyNames for a list of county names.

counties <- c('Bucks', 'Chester', 'Delaware', 'Montgomery', 'Philadelphia')

cty.shape <- selectGPACounties(counties)

raster::plot(cty.shape)

customLocalData

Subset local.data according to custom parameters. Here, we include data within the counties selected above and measured between 1 June 2017 - 31 May 2019 and between 16:00-18:00 U.S. Eastern Time each day.

my.local.data <- customLocalData(cty.shape, '2017-06-01', '2019-05-31', '16:00', '18:00')

head(data.frame(my.local.data))
##             Timestamp Latitude Longitude            Sensor.ID Temperature
## 1 2017-11-14 17:03:30 39.94724 -75.39016 AirBeam:00189610619E          15
## 2 2017-11-14 17:03:40 39.94723 -75.39022 AirBeam:00189610619E          15
## 3 2017-11-14 17:03:50 39.94723 -75.39024 AirBeam:00189610619E          15
## 4 2017-11-14 17:04:00 39.94730 -75.39030 AirBeam:00189610619E          15
## 5 2017-11-14 17:04:10 39.94732 -75.39032 AirBeam:00189610619E          15
## 6 2017-11-14 17:04:20 39.94732 -75.39032 AirBeam:00189610619E          15
##   Humidity PM1  PM2.5 PM10 Crime Count
## 1     43.0  NA 10.560   NA    NA     1
## 2     42.8  NA  9.396   NA    NA     1
## 3     42.0  NA  8.283   NA    NA     1
## 4     42.0  NA  7.848   NA    NA     1
## 5     41.8  NA  7.166   NA    NA     1
## 6     41.0  NA  6.728   NA    NA     1

localRaster

Create a brick of raster layers corresponding to local.data variables. Here, we use the local dataset and shapefile created above and use 100x100 resolution.

local.ras <- localRaster(my.local.data, cty.shape, 100, 100)

methods::show(local.ras)
## class      : RasterBrick 
## dimensions : 100, 100, 10000, 13  (nrow, ncol, ncell, nlayers)
## resolution : 0.01414947, 0.00887024  (x, y)
## extent     : -76.13645, -74.7215, 39.72156, 40.60858  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      :  Temperature,     Humidity,          PM1,        PM2.5,         PM10,        Crime, Temperature.log.density., Humidity.log.density., PM1.log.density., PM2.5.log.density., PM10.log.density.,      Poverty,      Traffic 
## min values :     9.885333,    18.841250,     0.000000,     0.345000,     0.252500,     1.000000,                 0.000000,              0.000000,         0.000000,           0.000000,          0.000000,     2.000000,    73.000000 
## max values :    38.350000,    82.879531,    10.311563,    68.921215,    21.718763,  1487.000000,                 3.723948,              3.723948,         3.725340,           3.725667,          3.725993,   100.000000, 72822.790747

localMap

Render a leaflet map display for a brick of local data rasters. Here, we create a map display for local.ras and project the boundaries of the counties selected above.

my.local.map <- localMap(local.ras, bounds = cty.shape)

tagList(my.local.map)

customEPAData

Subset the EPA.data according to custom parameters. Here, we include data within the counties selected above and measured between 1 June 2017 - 31 May 2019.

my.epa.data <- customEPAData(cty.shape, '2017-06-01', '2019-05-31')

head(data.frame(my.epa.data))
##         Date Longitude Latitude AQS_Site_ID        Local.Site.Name   State.Name
## 1 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 2 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 3 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 4 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 5 2017-06-02 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 6 2017-06-02 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
##      City.Name Monitor_Start_Date Last_Sample_Date PM2.5.ug.m.3. PM10.ug.m.3.
## 1 Philadelphia         2013-10-01       2020-09-30      7.637500           NA
## 2 Philadelphia         2013-10-01       2020-09-30      7.600000           NA
## 3 Philadelphia         2013-10-01       2020-09-30      7.637500           NA
## 4 Philadelphia         2013-10-01       2020-09-30      7.600000           NA
## 5 Philadelphia         2013-10-01       2020-09-30      7.066667           NA
## 6 Philadelphia         2013-10-01       2020-09-30      7.000000           NA
##   SO2.ppb. NO2.ppb. O3.ppb. CO.ppb.
## 1 1.140000       NA  41.059     0.3
## 2 1.140000       NA  41.059     0.3
## 3 1.116667       NA  41.059     0.3
## 4 1.116667       NA  41.059     0.3
## 5 1.225000       NA  29.118     0.3
## 6 1.225000       NA  29.118     0.3

intEPARaster

Create a brick of 1km x 1km raster layers rendered with inverse-distance-weighted interpolation of EPA data. Here, we use the EPA dataset and shapefile created above.

epa.ras <- intEPARaster(my.epa.data, cty.shape)

methods::show(epa.ras)
## class      : EPARasterBrick 
## dimensions : 98, 157, 15386, 6  (nrow, ncol, ncell, nlayers)
## resolution : 0.00901466, 0.009013171  (x, y)
## extent     : -76.13873, -74.72342, 39.72199, 40.60528  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      :      PM2.5,       PM10,        SO2,        NO2,         O3,         CO 
## min values :  7.9109829, 18.3353028,  0.3246053,  8.7085803, 25.0908216,  0.2633197 
## max values : 10.6409376, 18.3353028,  0.9282857, 11.2932587, 30.7403024,  0.4161934

EPAMap

Render a leaflet map display for a brick of interpolated EPA data rasters. Here, we create a map display for epa.ras and project the boundaries of the counties selected above.

my.epa.map <- EPAMap(epa.ras, bounds = cty.shape)

tagList(my.epa.map)